How to Select the Best Split in Decision Trees Using Chi-Square
Let's see how to calculate the expected values. Recall what the split on "Performance in class" looks like. There are 20 students in total; 10 of them play cricket and 10 do not, so 50% of the students in the parent node play cricket. Now consider the "Above average" node, which contains 14 students. If this node followed the same distribution as the parent node, we would expect 50% of its 14 students, that is 7, to play cricket. The actual number of students who play cricket in this node is 8. So now we have both values for this node: the expected value (7) and the actual value (8).
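The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `expected_counts` and the dictionary keys are made up for illustration, and the count of students who do not play cricket in the "Above average" node (6) is simply derived as 14 minus 8.

```python
def expected_counts(parent_counts, node_size):
    """Expected class counts in a child node, assuming the child
    follows the same class distribution as the parent node."""
    total = sum(parent_counts.values())
    return {cls: node_size * count / total for cls, count in parent_counts.items()}


# Parent node: 20 students, 10 play cricket and 10 do not.
parent = {"plays_cricket": 10, "does_not_play": 10}

# "Above average" node: 14 students, 8 of whom actually play cricket.
above_average_size = 14
actual = {"plays_cricket": 8, "does_not_play": above_average_size - 8}

expected = expected_counts(parent, above_average_size)

for cls in parent:
    print(f"{cls}: expected={expected[cls]}, actual={actual[cls]}")
```

For the "plays_cricket" class this gives an expected value of 7.0 against an actual value of 8, matching the numbers worked out by hand.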